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Abstract 

Variational data assimilation in continuous time is revisited. The cen- 
tral techniques applied in this paper are in part adopted from the theory of 
optimal nonlinear control. Alternatively, the investigated approach can be 
considered as a continuous time generalisation of what is known as weakly 
constrained four dimensional variational assimilation (WC-4DVAR) in 
the geosciences. The technique allows to assimilate trajectories in the 
case of partial observations and in the presence of model error. Several 
mathematical aspects of the approach are studied. Computationally, it 
amounts to solving a two point boundary value problem. For imperfect 
models, the trade off between small dynamical error (i.e. the trajectory 
obeys the model dynamics) and small observational error (i.e. the trajec- 
tory closely follows the observations) is investigated. For (nearly) perfect 
models, this trade off turns out to be (nearly) trivial in some sense, yet 
allowing for some dynamical error is shown to have positive effects even in 
this situation. The presented formalism is dynamical in character; no as- 
sumptions need to be made about the presence (or absence) of dynamical 
or observational noise, let alone about their statistics. 



1 Introduction 

Suppose we are given a time series 

{n t eR d 7 te[t s ,t f }} (1) 

which we will refer to as observations or measurement^. The time t is assumed 
to be continuous. Note that the observations might be higher dimensional (i.e. 
d > 1 is permitted). A model for the observations is usually defined by certain 
dynamical equations 

it = f(x t ), (2) 
y t = h(x t ), (3) 

1 in this paper, italics are used to introduce technical terms 
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where x* G M. D will be referred to as the state, and yt as the output of the model 
at time t. Entire trajectories will be denoted by {xt,t G [i s ,i/]} or simply {a;} 
if the time interval is clear from the context. 

The problem considered in this paper is to make the devations between model 
output and observations small in the interval [t s ,tf], or in other words, to find 
trajectories {x t ,t G [t s ,tf]} so that y t = T)t for all t. Different communities use 
different names for this type of problem. In the geophysical sciences, it is mostly 
referred to as "data assimilation" . Making this notion more precise obviously 
requires to quantify the term "deviation" . Mainly for computational reasons, it 
is popular to work with a quadratic measure of deviation, that is, 

At := ~ / ' \yt ~ r?*| 2 d*, (4) 
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or similar. Obviously, At, henceforth called the tracking error, depends not only 
on the model as specified in Equations but also on the initial condition 

X s . 

A few words as to the relevance of this problem might be in order. Firstly, 
we might be interested in the trajectory {x} for diagnostical purposes. If {h(x)} 
reproduces the observations well, we might reasonably hope that {x} holds clues 
about the "underlying state" of the observed system, such as regime changes. 
At least, we might learn something about the quality of our model equations. 
Secondly, the eventual goal might be to forecast future values of either observed 
or even unobserved states. If proper initial conditions are at hand, forecasts are 
generated by running the dynamical model @ into the future. Data assimilation 
provides a means to find proper initial conditions. 

At first sight, it seems reasonable to try the following variational approach 
to the problem: Choose the trajectory {x} of the model (|2I3[) so that the output 
{y} minimizes the tracking error At- A more in depth analysis though would 
reveal that, by following this route, we are very likely to run into numerical 
difficulties, since the problem we are facing has a very poor condition. This is 
looked at in detail in Section ET21 but roughly speaking, a poor condition means 
that small changes in the problem specification (e.g. small changes in {77} or the 
model /) can entail very large changes in the solution {x}. Almost all numerical 
algorithms though work by solving a series of nearby problems. If these small 
changes cause large changes in the corresponding solutions, the algorithm will 
have trouble to converge. As an aside, this problem is not restricted to instable 
systems, as we will see. A stable system creates essentially the same problems as 
an instable system. If instability was really the only cause of trouble, a reversal 
of time would solve the issue. 

Notwithstanding these concerns, the abovementioned approach has been 
considered in a large number of publications. In the geosciences, this data 
assimilation technique is known as (strongly constrained) four dimensional vari- 
ational data assimilarion, or 4D-VAR; Several dynamical weather forecast- 
ing systems employ this approach operationally, and certainly with success. 
For a discus sion of the approach and applic a tions to geophysical syst e ms se e 
for example Le Dimet and Talagrandl (Il986 k Talagrand and Courtier ( 1987 ): 



Courtier and Talagrandj(jl987l) : lRabier et al.l (jl993l):IPires et all rtl996h:lApte et al 
( 200 S]) , a list which i s by no means exhaustive. I nlFarmer and Sidorovichl (|l99dh ; 



Brocker and Parlitzj (|200lh ; iRidout and Juddl (|2002i a very similar approach 
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was considered (at least as a motivation), albeit from a nonlinear dynamics' 
perspective; furthermore, different solution strategies were employed. Several 
interesting observations were reported. These observations allow for an interpre- 
tation within the framework of the present paper, to be discussed in Section [SJ 
This discussion might also pertain (at least partly) to operational implementa- 
tions of 4D-VAR, and hopefully explain why the strongly constrained 4D-Var 
approach works to some extent, despite the abovementioned reservations. 

The technique considered in this paper is m otivated by similar investiga- 



tions in the e ngineering literature (see for example ISagel . Il968 ; Jazwinskv . 1970; 
Sontag, 1998 ). as well as a recent paper bv lJuddl ( 2008bh . In the geosciences, the 



shortcomings of 4D-VAR lead to the developement of weakly constrained (WC) 
4D-VAR, which might be unde r stood as a special case of the follow ing approach 
(see for example iDerberl . Il989l ; iTremoled . 120061: lApte et all . l2008h . Consider a 
modified model 



x t 
Vi 



f(x t ,u t ), 
h(x t ), 



(5) 
(6) 



which allows for a dynamical perturbation ut, with the idea that f{xt, 0), the free 
running system, constitutes our ideal model. This means that we expect truth to 
behave essentially as described by the vector field /(•, 0), which encompasses all 
our A priori understanding of the problem at hand. In particular, /(•, 0) is the 
model we might want to use in ord er to extrap olate trajectories into the future. 



In the theory of optimal control (jSagd . 119681 ). ut plays the role of a control 
parameter which allows to be manipulated externally, in order to force either the 
state Xt or the output r\t to exhibit some desired behaviour. The word "control" 
though has an entirely different meaning in geophysical modeling. Further to 
that, the interpretation of u t in the present problem is different, and more that 
of a model imperfection or error, whence the word "dynamical perturbation" 
was chosen here for u t . This interpretation might suggest that the existence 
of a perfect model is assumed, and that in the (near) perfect model case, u t is 
not needed. The presented methodology however has no need to assume the 
existence of a perfect or even near perfect model. Furthermore, it will become 
clear that there are good reasons to allow for some dynamical perturbation ut 
even if the free running model /(-,0) has in fact generated the data (perfect 
model case). 

Whatever the interpretation of Ut, we are striving to reproduce the obser- 
vations without too large deviations from free running model /(-,0), that is, 
without too large dynamical perturbations. Hence the variational approach is 
modified by adding a further term to At, referred to as modelling error, pe- 
nalizing large dynamical perturbations, that is, large deviations from the ideal 
model /(-,0). More specifically, we consider the action integral 



(1- 
(1 



a)A T H 
-a) 



uA 



M 



(vt - yt) T R(vt 



a 

Vt)dt + - 



uT Sutdt. 



(7) 



Here a £ [0, 1] is a weighting parameter. Varying a allows to explore the trade- 
off between the desired goals of minimizing the tracking error and keeping the 
dynamical perturbations small. The positive definite matrices R and S help to 
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normalize the different components of observational and dynamical errors in At 
and Am to about the same orders of magnitude. The factor 1/2 simplifies sub- 
sequent expressions. Several authors have interpreted R and S, respectively, as 
(the inverse of) observational and dynamical error covariances. More generally, 
an action integral of the form (JT]) can be motivated as a m aximum-likelihood- 
approach to the smoothing problem (see Ijazwinskv . 1970, , Chap. 5, Sec. 3 for 



details). This interpretation though relies on a large number of assumptions 
which are hard to verify or even outright wrong. At the same time, it is not 
clear what the benefits of such a stochastic interpretation would be in the present 
context. 

In fact, other reasonable options for At and Am are conceivable which would 
not have an immediate interpretation in stochastic terms. For example, setting 

A M = jf ' (/ (x t ,u t ) ~ f (x t , 0)) T S (/ (x t ,ut) - f (x u 0)) dt (8) 

would directly measure the deviations from the ideal model. In addition, R 
and S could depend on t, too, thereby weighting different observations and 
dynamical perturbations differently (e.g. if observations from the remote past are 
deemed less important than more recent ones) . Further examples are considered 
in Section [5] 

This introductory section will finish with an overview over the following 
sections. The next section recalls the basic facts from the calculus of varia- 
tions, deriving a necessary condition for (Q to have a minimum under the con- 
straints (|5I6|) . This necessary condition has the form of a Hamiltonian two point 
boundary value problem. In Section [3l properties of the solution are discussed. 
In particular, an interpretation of the Lagrange multiplier is given. It turns out 
that the Lagrange multiplier conveys important information as to the stabil- 
ity of the solution. Section @] briefly explains approaches to solve the two point 
boundary value problem as well as other technical aspects of the approach. Most 
importantly, the concept of continuation is discussed. In Setion[5j some related 
work is discusse d, in particula r the connection between the present paper and 



a recent work by lJudd ( 2008b ) . Finally, Setion [5] discusses observations from a 



numerical experiment. 



2 Necessary conditions for an optimum 



The reader is r eminded of some of the basic facts from variational calculus (for 
details, see e.g. Sage . 19681) . Let the Lagrangian L(x,u,t) be a function of time 
t and the variables x and u. The task is to minimize the action integral 



A{{x,u}) 



L(x t ,u tl t)dt 



(9) 



over all trajectories {x, u} that satisfy a dynamical constraint of the form 

x t = f{x t ,u t ) (10) 

and maybe the initial condition 

xu=Z, (11) 
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where £ is given. In this paper, Lagrangians will mostly be of the form 

L = (r? t - h(x)) T R{r) t - h{x)) + ^u T Su (12) 

or similar. 

We will consider both the case of a specified initial condition and the case 
where the initial condition is free and hence subject to optimisation. These two 
cases will be referred to as fixed initial condition problem and free initial con- 
dition problem, respectively. Any trajectory {x, u} that satisfies Equation (|10[) 
(and also (jlip in the fixed initial condition problem) is called a feasible trajec- 
tory. Let {xt-,Ut} denote a minimizing trajectory, that is, a feasible trajectory 
with the property that A({it, u t }) < A{{x, u}) for any other feasible trajectory 
{x, u}. By A = A({xt,ut}), we will denote the optimum value of the action. In 
order to formulate a necessary condition for {i t ,u t }, define the Hamiltonian 

H(x, u, A, t) := L(x, u, t) + X T f(x, u). (13) 

Then for {xt,ut} to be a minimizing trajectory, it is necessary that {it,Ut, Xt} 
is a stationary point of the augmented action 

A({x,u, A}) := / H(xt,u t ,\t,t) - \ t x t 



f 



L(x t ,u t ,t) - Xf [x t - f(x t ,u t )] dt. (14) 

ts 



Using the calculus of variations, it can be shown that in order for {it, lit, At} to 
be a stationary point of the augmented action, it is necessary that they solve 
the two point boundary value problem 

lt +Xt = °' (15) 
dH t 



dX 



-it = 0, (16) 

^ = 0, 
du 

At. = 0, (b) x t . = £, (18) 

X tf = 0. (19) 



(17) 



Here, initial condition I\l8h ) holds for the problem with free initial condition, 
while (|18b ) holds for a fixed initial condition. The corresponding boundary 
conditions for problems with either both ends or even only the final value of 
{x} specified should be evident. Sometimes, instead of boundary conditions 
like (|18I19|) . additional penalty terms of the form 

A s :=(x t3 -^) T B s (x ts ~ Cs), A F :={x tf -i f ) T Bf{x tt -if) (20) 

appear in the action integral. In control theory, such terms are referred to as 
start or end point penalties, while in data assimilation the term As is called 
background error. If present, such terms lead to boundary conditions of the 
form 

A ts = B s {x tB - 6), A t/ = B f (x tf - i f ). (21) 
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Inserting the definition of the Hamiltonian into Equation (|T5|) gives back the 
original dynamical condition (|2J|, while Equation (|I5I) can be written as 



dx J \ dx 

The coupling between the dynamical relations (fT5|) and (fTB]) is furnished 
by the static condition (|I7[) . The role of this condition becomes much more 
prominent if further inequality conditions are imposed on the dynamical per- 
turbatio ns, in which case the celeb rated Pontryagin Maximum Principle (see for 
example ISontad. Il998t ISagel Il968l ) states that condition JI7J is to be replaced 



by 

H(x t ,ut,Xt,t)<H(x t ,w,X t ,t) (22) 

for all feasible perturbations w. If there are no further inequality constrains 
on the dynamical perturbations, the Maximum Principle obviously agrees with 
condition (1171) . It is very convenient if the problem is set up in a way that 
allows for condition (|17[) to be solved for u t , which can then be eliminated from 
the entire problem. To simplify the subsequent discussion, we will from now on 
assume this situation. The vector field 

*< A -> = (-f-f)<^> < 23 > 

is called the Hamiltonian vector field, where it is understood that on the right 
hand side, the dynamical perturbation u has been expressed as a function of A 
and x, using the condition (JTTJ) . 

Although the stationarity conditions (I15M19[) are but necessary conditions for 
a minimum of the action, they provide us with a practical means to calculate 
solution candidates. What we have gained is that the minimisation under a 
D-dimensional differential equation as constraint has been replaced by a 2D- 
dimensional differential equation. Note however that this differential equation 
is not an initial value problem, but rather a two point boundary value problem. 
In an initial value problem, the entire state (x t , A t ) would be specified at t = t s , 
while in the present two point boundary value problem, the state is specified in 
part at the beginning and in part at the end of the time window. 

We will finish this section with deriving the Hamiltonian equations for the 
case we are mainly interested in, which is a quadratic Lagrangian as in Equa- 
tion (0 and a dynamical system of the form 

it = f(x t ) + u t , y t = Cx t , (24) 

That is, perturbations to all degrees of freedom, and linear output. The Hamil- 
tonian equations for this setup read as 

A t = -Df(x t ) T X t +C T R( Vt -Cx t ), (25) 
it = f(x t )+u t , (26) 

1 — IT 

u t = -S^Xt (27) 

a 

The following facts are easily derived from the system (|25H27p . Firstly, if 
there is a trajectory {x} of the free running system (i.e. with u± = 0) so that 
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Cxt — rj t for all t, then this trajectory is also a solution of (|25ti27|) with free initial 
condition. The corresponding At vanishes. In other words, in the perfect model 
scenario, we get the desired solution. Secondly, the boundary condition (fig)) 
implies that the dynamical perturbation vanishes at tf. This has implications for 
the forecasting problem. Unless some way is found to extrapolate the dynamical 
perturbation u t into the future (i.e. beyond tf), forecasts for rj t and Xt are 
generated using the free running model f(x) in (|24p . The fact that u tf = 
means a smooth transition from the reconstructed orbit {xt,t € [t s ,tf]} to 
the forecast orbit {xt,t > tf}. This feature presumably yields more realistic 
trajectories than if the dynamical perturbations were switched off suddenly. 
Thirdly, if {A, x} is a solution of (|25H27p . then {cA, x} is also a solution but with 
oR, cS replacing the matrices R and S, where c € R. In fact, this is true in 
general: If cL replaces the original Lagrangian L, then {A, x] has to be replaced 
with {c\,x}. Finally, we see from Equation (|2T|) that the case of no dynamical 
perturbations, in other words, the "naive" minimisation problem considered at 
the very beginning of this paper, in which At (Equ. [4]) is minimized subject to 
the dynamical system Equation (PIS]), is equivalent to taking the limit a 1. 
Note that this corresponds to assigning a strong weight to the modelling error 
Am- In this situation, there is only unidirectional coupling in (|25[) . namely 
from xt to At . This has led several investigators to ignore At . As will be seen in 
the next section though, At conveys interesting information in any optimisation 
problem under constraints. 



3 Lagrange multipliers and dynamical pertur- 
bations 

A central motivation for the presented approach was that it allows for a reg- 
ularisation of the problem, even in a situation in which the proposed model 
has actually generated the data. This aspect is looked at in detail in Subsec- 
tion after a few general facts on Lagrange multipliers have been presented 
in Subsection 13. II 



3.1 Interpretation of the Lagrange multiplier 

The Lagrange multipliers are not just a mere byproduct of the presented ap- 
proach but provide useful information. In any optimisation problem under con- 
straints, a Lagrange multiplier corresponding to a specific constraint describes 
the derivative of the objective function with respect to that constraint at the 
optimum. This fact is a consequence of a somewhat more general result on 
parameter dependencies of the objective function at the optimum, of which we 
give an informal derivation. By A and A, respectively, we denote the functions 
A and A (i.e. the action and the augmented action, resp.) at the optimum. Sup- 
pose the Hamiltonian H in Equation (|13[) depends on an additional parameter 
9. Assume that small but otherwise arbitrary perturbations are applied to both 
9 as well as the functions {x, u, A}. The total first order change in A is given by 

qA 

S tot A = 5A + —59 (28) 
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where 8 A denotes first order changes due to perturbations of {x, u, A}. If A is 
at optimum, then 6 A — 0, so that in this situation, 

f)A 

S tot A = —60 (29) 
ad 

From the definition of A, it follows that at the optimum 

A = A, (30) 

identically in 8, so that for the total derivatives 



d.4 _ dA 

Combining this with Equation (|29p . we obtain 



dA _ dA 
~dl ~ ~dJ 



(31) 



(32) 



We apply this to the situation where the dynamics is given by 

x t = f(x t ,u u t) + 0r t (33) 
with r t some function of time. Equation (|32j) in conjunction with (1141) yields 

a a r 

— (0 = 0) = J Xjr t dt, (34) 

demonstrating that the Lagrange multiplier describes the first order changes 
of the objective function at the optimum with respect to perturbations of the 
constraints (i.e. the dynamics). 



3.2 Motivation for using dynamical perturbations 

In this subsection, the effect of dynamical perturbations on the tracking problem 
is investigated. In a nutshell, it will turn out that without dynamical perturba- 
tions, relatively small changes of the dynamical system as well as of the obser- 
vations can have relatively large effects on the optimal trajectory. Allowing for 
dynamical perturbations will be seen to alleviate these effects. For simplicity's 
sake, we will study an example that is in fact not a dynamical system. Consider 
the problem 

Minimize A(x) := —{x — rf) 2 subject to Mx = 6, (35) 

where x, z € R™, M is an arbitrary m x n matrix, and S € M. m . A "dynamical 
perturbation" u (which is in fact static in this case) can be introduced into this 
problem by modifying it thus: 

Minimize A(x, u) := —(x — r\) 2 + -u 2 subject to Mx = bu + S, (36) 
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where & is a scalar factor and u £ R m . The solution to this problem is most 
conveniently described by letting M = W T SV be the singular value decompo- 
sition of M , with W an m x m orthonormal matrix, V an n x n orthonormal 
matrix, and S an m x n matrix of the form 



S = 



I <Tl 



V o 



a,- > 0. 



Introducing x' :— Vx, u' :— Wu, 5' := WS, rj ■= Vr] and using Lagrange 
multipliers, the problem becomes one of finding a stationary point of 

A(x, u, A) = i(ac - I]) 2 + -it 2 - X T [Sx - bu - 5]. 

(Here we have switched back to our old notation, ommitting the dashes.) It is 
straight forward to verify that 

(b 2 r]i + aiSi) , if a t > 

(61) 

r)i, else 

Xi = — j - cTjTji) , i = 1 . . . m (38) 
tr + of 



1 m 1 

i = 2^^T^ Mi ~^ (39) 

Obviously, A^ = dA/dSi, as was proved in the last subsection. We are interested 
in the effect of small changes in the data 8, rj on the solution of problem (|3l)f . A 
serious difficulty that appears in data assimilation (and other contexts) is that 
the singular values <7j range over many orders of magnitude. This can render 
the smaller singular values indistinguishable from zero. Keeping this in mind, 
the following conclusions can be drawn from Equations (|37H39t : 

1. If b > 0, then x, A and A change continuously if <7i — > 0. Furthermore, the 
derivatives of x, A and A with respect to S, rj remain finite. 

2. Letting b — > 0, we have to assume that Si — whenever cr, = 0. We obtain 

[ rn if a* = 0, 5i = 
A, = ^-U--) (41) 

2=1 X 7 

It is seen that the solution is discontinuous for cr; — > 0. Moreover, the 
influence of <5, on the solution becomes infinite. If 6i ^ for (Tj = 0, the 
problem becomes infeasible. 

It is worth pointing out that a nonzero b does not bring the solution closer to 
any presumed true solution. It merely ensures that small changes in the prob- 
lem formulation result in small changes in the solution. Numerical approaches 
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to data assimilation work by solving a series of nearby problems of the form 
considered in this subsection. If this series of nearby problems does not result 
in a series of nearby solutions, the approach will fail to converge. These con- 
siderations motivate the introduction of dynamical perturbations u not only if 
model error is presumed present, but even if the model has actually generated 
the data. 



4 Solving BVP's and continuation 

Two point boundary value problems (BVP's) have been thoroughly investigated, 
and a multitude of algorithms and solution techniques exist. Subsection 14.11 
contains a few words on solution techniques and points to some references on 
this subject. The Subsection explains continuation, an important and useful 
technique when working with BVP's. 



4.1 BVP solution techniques 

Although the Hamiltonian BVP (|15H19[) involves ordinary differential equa- 
tions (ODE's), it is not an initial value problem, which is what most ODE 
solvers are intended for. To solve BVP's, different strategies are needed. For 
a c omprehensive collec tion of various numerical techniques for solving BVP's, 



see 



Childs et all ll 1979ft . 

A general approach consists of approximating the flow over small time inter- 
vals to translate the original problem into a BVP in discrete time — essentially 
a large system of nonlinear equations. For the moment, we are not using the 
fact that the vector field is Hamiltonian, so we will be assuming (until further 
notice) that the BVP is given as 

Zt = F{t,z t ), t€[t a ,t f ], b(z t3 ,z tf ) = 0, (43) 

with z t € M. D , F a vector field on R 15 and &(.,..) a function representing the 
boundary conditions. BVP solvers approximate z t at a series t s = to < . . . < 
fjv = tf of temporal mesh points; we will write z :— (zt , ■ ■ ■ , Zt N ). Usually, z is 
obtained by solving a set of equations 

*iOO = > i = 0...N, (44) 

called collocation equations. The collocation equations are effectively discrete 
time approximations of the original differential equation as well as the boundary 
conditions in (l43l) . Note that the mesh points need not be equal to the time 
instances at which the observations {rj} were sampled. Typically, the collocation 
equations are solved using a quasi-Newton type algorithm, which involves a 
numerical approximation to the Jacobian of t he collocation equations. As an 
example for a BVP solver, the code bvp4c bv iKierzenka and Shamnmel ( 2001 ) 



implemented in Matlab uses a fourth order implicit Runge-Kutta scheme to 
approximate the differential relation. 

The collocation approach was used for the numerical experiments in this 
paper. This approach, albeit fast and reliable, becomes prohibitively expensive 
for larger systems, since the entire system of collocation equations is solved 
simultaneously. Even if the sparsity of the collocation equations is exploited, 
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this approach is unlikely to work for relevant geophysical problems. Hamiltonian 
BVP's though c an be solved sequentially by the method of invariant imbedding, 
see for example ISagel ( 19681 ). Whether this route yields a feasible approach for 



large scale systems will be subject to future research. 

One might contemplate a completely different approach in which the original 
dynamics (|5l6p are linearized, while the action integral is expanded to second 
order (if not already in this form). The result is a so-called linear-quadratic 
programme. The linear Hamiltonian equations obtained in this way though are 
not the same as the linearized original Hamiltonian equations. In fact, this 
approach neglects important curvature terms and might even render a perfectly 
solvable problem infeasible. This well known phenomenon is discussed in many 
textbooks on nonlinear programming. This approach is not recommended. 



4.2 Continuation 

An important technique when dealing with BVP's (or indeed with any type of 
equation solving) is continuation. Suppose the entire problem depends on a 
parameter a. To every value of a, there corresponds a solution {x(a), A(a)}. 
Sometimes a solution is easily obtained for a specific oto, while finding a solution 
at the desired value a± is hard. This situation calls for continuation, which 
means to first calculate a solution at the "easy" value ao and then use that 
solution as an initialisation for a new run with a slightly altered a, hoping that 
the solution for the new value of a is still reasonably close to the solution for 
q.q. This process is repeated, gradually moving a towards the desired value a.\. 
For continuation, it is most useful if the employed BVP solver accepts initial 
guesses in exactly the same format as it produces solutions, as is the case for 
bvp4c. 

In this paper, we will use continuation with respect to the mutual weighting 
between observational errors and dynamical errors. To be specific, the action 
integral is written as A a = (1 — a) At + a Am with some fixed positive definite 
matrices R,S. Starting with a very small model error penalty (i.e. a = 0), 
the solution is obviously any trajectory {x. A} so that h(xt) = rj t (and At some 
large values) for all t. Hence, a = appears to be a good starting value for 
continuation. 

Another interesting option is continuation with respect to time. The interval 
[i s ,i/] can be divided into subintervals on which the BVP gets solved individu- 
ally. Depending on which boundary conditions are used, either the dynamical 
perturbations or the orbit itself can be rendered smooth at the endpoints of 
the individual intervals. Splicing together the individual solutions should pro- 
vide a fairly good initial guess for the optimal orbit on the entire interval. 
Alternatively, a realtively short interval [t s ,tf] might be computed first. With 
new measurements 77 being aquired, the interval gets sequentially incremented. 
Again, previous solutions projected into the future should provide a good initial 
guess for the updated solution. 



5 Relation to previous work 

As was mentioned in the introduction, minimising the tracking error At subject 
to the model Il2l.''>[) without dynamical perturbation was considered in several 
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publications (outside the meteorological community ) 
for noise reduction bv lFarmer and Sidorovich (1990). 



for example as a means 
Noise reduction refers to 
the following problem. It is assumed that rj t — x t + £t, where the dynamics 
of governing x t are an identical copy of the model, and £t is white noise. In 
other words, the model is indeed perfect, up to observational errors. Also, it 
is assumed t hat C is the identity matrix, that is, there are no hidden state 
variables. In iFarmer and Sidorovichl ( 19901 ) and all other papers referred to in 
this subsection, the time is discrete. In discrete time, application of standard 
Lagrange multiplier theory leads directly to the collocation equations. In order 
to keep the discussion as streamlined as possible, we will stay with continuous 
time here, tacitly apply i ng the necessary modifications without any further note. 
Farmer and Sidorovich (Il990l ) approach the Hamiltonian BVP using what the 
authors call "manifold decomposition" . This is an iterative procedure whereby 
the dynamics @ is linearised about a trial solution and decomposed into the 
(linear) stable and unstable directions. Corrections to the trial solution are 
obtained by solving the linearized system forward in time along the stable and 
backward in time along the unstable directions. The Lagrange multipliers Xt 
are ignored. Several difficulties of this approach are readily acknowledged, most 
importantly, the bad condition of the problem, and that the solution is not in 
fact a solution of the as posed initially. The algorithm merely finds a solution of 
the model dynamics, starting at the observations and proceeding in steps which 

are g uaranteed to be small . 

In Brocker and Parlitj (|200ll ) and iRidout and Juddl 
proble m is discussed, the approach though is different from 



2002), a very similar 



(|l990h . Both papers propose to minimize the indeterminism 



Farmer and Sidorovich 



I = - 



I it - f(x t )\dt, 



(45) 



which, in the present termi nology, is the modelling e rror A m. To minimize (1451) , 
gradie nt descent is used bv lRidout and Judd ( 2002 ). while Brocker and Parlitz 
(|200ll ) employ a simplified Newton-Raphson strategy. 

In all three works, the observations come in only as an initial trial solution 
for {x} (recall that in both papers, Cx — x). In subsequent iteration steps, 
the observations are ignored. This seems to be rather counterintuitive: To 
minimise the tracking error At, you ignore the observations and minimize the 
dynamical error Am instead. Nonetheless, the approach appears to work well, 
at least if model error is absent. Using the methodolo gy presented in this paper , 
an interpretation of this so mewhat curious finding of IRidout and Juddl (|2002h : 
Brocker and Parlitj (|200lh is hazarded. It seems that the solution strategy 
corresponds to a (somewhat crude) continuation scheme. Recall that a scheme 
without dynamical perturbation corresponds to a large model error penalty, or a 
large ratio In other words, the observations should eventually be ignored. 

Obviously, the observations cannot be ignored altogether. As was discussed, 
problems for large a need to be solved using continuation, starting with small a. 
The mentioned papers assume that full state information is available, hence the 
Equations ([231 - 123 |) can be written as 



A* 

a (f( x t) ~ it) 



-Bf(x t ) T \ t + {l-a) {nt-xt) 



A, 



(46) 
(47) 
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The so l ution for small a is jx, A} = {77, 0}, exactly what is used bv lBrocker and Parlitz 
(|200lh : iRidout and Juddl (|2002l ) to initialise their algorithms. We can conclude 
that the solution strategy in these publications corresponds to a continuation 
scheme in which a is moved from to 1, which is appropriate for small dynamical 
errors. 

The remarks of this section might as well apply to (strongly constrained) 
4D-VAR. The author certainly cannot claim to have sufficient knowledge of 
the specifics of 4D-VAR implementations, which certainly differ substantially 
from the algorithms used in the publications discussed in the present subsection. 
Considering the discussed difficulties though that would be encountered if the 
strong constraint in 4D-VAR were taken "too literally", it is speculated that 
successful implementations use in fact some kind of weakened strong constraint, 
either with or without acknoledging this fact. Therefore, some of the remarks 
of this section might as well perta i n to ( not so) strongly constrained 4D-VAR. 
The ideas of IRidout and Juddl (120021) were subst antially expanded up on in 



a series of papers bv Kevin Judd (I Juddl . l2003l l2008alibh . We will focus on Judd 
( 2008bh . as it developes ideas not unsimilar to the present setup. In that paper, 
is proposed to minimize the tracking error At subject to 



it - f{x t ) = u t 



and 



A M < S. 



(48) 



Several options for th e exact form of At and Am are discussed. In particular 
for ,4 m. I Juddl (|2008bl) suggests either a mean quadratic error as considered here 
(Equ. HJ) or a maximum deviation of the form Am = max t ufSu t . The first case 
is equivalent to minimizing an error functional A = (1 — a) At + a Am with an 
a depending on S. Hence this case is a special case of the situation considered 
in this paper. Using a maximum deviation for Am is equivalent to imposing an 
inequality constraint of the form uj Su t < S, which in continuous time requires 
appl ying the Pont ryagin Maximum Principle, as was discussed in Section [TJ As 
said, I Juddl (l2008bh considers only discrete time, and the minimisation problem is 
solved using the standard Lagrange multiplier method. The necessary conditions 
directly yield the collocation equations. 



6 Observations from a numerical experiment 
6.1 A prop for reality, and the model 

In this section, results from a few numerical experiments will be discussed. As 
a prop for reality, consider 

X t = F(X t ) + as t r h = CX t +pr t , (49) 

where for the dynamics F we will consider the Lorenz '63 system as well as 
the Lorenz '96 system. We will use a capital letters like X t for the reality, 
while Xt is reserved for model trajectories. The quantities St and r t are random 
perturbations to the dynamics and the observations, respectively. Their exact 
form is discussed below. 

The model we plan to use for tracking r\t will in both cases be of the form 

x t = f(x t )+u t , y t = Cx t , (50) 



13 



where / might be different from F. The action integral A a is as specified 
in Equation ([7]), with R and S being diagonal matrices defined as follows. Let 
var{a;} be the temporal variance of some time series {x}. If {x} is vector valued, 
then var{x} is understood component-wise, with the individual components 
being denoted by var{x}j. With this definition, we set 

Ru = l/va,r{Cx} l: S« = l/var{/(a;)} i . (51) 

These constants are calculated off-line with {x} being some typical trajectory 
of the model. 

The random perturbations r t and St are supposed to account for model er- 
ror. They are often referred to as dynamical and observation error, respectively. 
There is some sort of agreement in the community that taking these perturba- 
tions as white noise processes is an acceptable choice. Notwithstanding the fact 
that in practice, model deficiencies are usually anything but white noise, we 
will essentially stay with this custom in order to obtain comparable results, al- 
beit with some modifications necessitated by the present situation. Using white 
noise in conjunction with dynamical systems in continuous time requires some 
caution. It is well known that if a vector field is subjected to white noise pertur- 
bations, the standard theory of ordinary differential equations ceases to apply, 
and stochastic analysis has to be used. This concerns not only our prop for 
reality fEau. I4T)]) . but the Hamiltonian equations (I15H19[) as well, since they are 
subjected to the input which contains white noise components. In practice 
though, the observed signal will always be discretely sampled. The sampling 
mesh however need not be appropriate for the B VP-solver, which for numeri- 
cal reasons might need a finer mesh. In this situation, it becomes necessary to 
interpolate the observations at intermediate time points. In order to guarantee 
that the interpolated observations are sufficiently regular, we will assume once 
and for all that {rj} has a spectrum limited to frequencies smaller in magnitude 
than some bandwidth /3Hz. This bandwidth has to be chosen as part of the data 
assimilation procedure and hence becomes a component of the model. A way 
to ensure a bandwidth limited continuous signal {77} is to sample the original 
observations with a rate of 2/3 and interpolate intermediate values. Note that 
the approach is not only operationally feasible, but typically the only option in 
practice, as the observations are usually given as a set of discrete observations 
with fixed sampling rate. 

Solutions were computed using continuation with respect to a, starting with 
a = 0, gradually increasing a to the desired value. Here and also in the presen- 
tation of the results below, it proved useful to introduce the inverse logit 7 € Ml 
of a via 7 := log(a) — log(l — a). This transformation maps the unit interval 
monotonously onto the real line. When speaking about "plotting results versus 
a", we really mean plotting the results versus 7(a). 

The quality of the solutions was evaluated in terms of the quadratic deviation 
to the real solution X tl that is, 

A A ■= [ (x t - X t )Q(x t - X t fdt, (52) 
Jo 

a quantity which we refer to as the assimilation error. Here, Q is given by a 
diagonal matrix with entries 

Qu := l/vax{x}i. (53) 
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It is worth noting though that in real world applications, X t is not known (or 
does not exist, depending on one's philosophical stance). Hence, A a cannot be 
the basis for quality assessment under operational circumstances. 

In the situations studied here, Aa was found to feature a unique minimum. 
That such a minimum must exist is quite plausible. For a — > 0, a vanishing 
penalty is imposed on the dynamical perturbations u t , but a strong penalty on 
the observational error. The solution will follow the observations very closely, 
thereby modelling not only the observations themselves but also the observa- 
tional noise. If dynamical noise is present, this requires a nonvanishing u t , but 
this has no effect on the cost functional A. Thus, the algorithm can use large u t 
with impunity. A large Ut is however costly in terms of Aa, since the orbit {x} 
will not be a solution of the free running model, and therefore, will deviate from 
{X}. On the other hand, for a — > 1, the observations are ignored, and again, if 
dynamical noise is present, the solution will soon deviate from the true trajec- 
tory. These deviations have no effect on the cost functional A Q , but of course, 
they do have an adverse effect on the assimilation error Aa- So somewhere in 
between, there should occur a minimum of Aa- 



6.2 Numerical results for Lorenz'63 

The Lorenz '63 system is given by the vector field 

f{x) = 28x« - x^ - X W X (3) . (54) 

\ X W X &) - | x (3) J 

In these experiments, we used F — f, but note that still a significant model 
error is present, due to the dynamical noise in the prop for reality. The tracking 
problem was investigated for a selection of observational and dynamical noise 
amplitudes (see Tab. [IJ, and a few representative results are presented here. 
The signal-to-noise ratio (SNR) is defined as the energy of the signal versus the 
energy of the noise. More specifically, we set a and p as diagonal matrices so 
that 

SNR D :=101og 10 Var{f f )} ' (55) 
for the dynamical noise, and 

SNR := 10 log 10 ^M (5 6 ) 
P 

for the observational noise. The noise amplitudes are defined with respect to 
some typical trajectory {X} of the true dynamics. The signal rj t was sam- 
pled with 20Hz, the total assimilation window comprised 256 points, whence 
t £ [0, 12.8]. Continuation was carried out in steps of 7 of not more than .5, 
commencing with 7 = —8, which corresponds to a = exp(— 8). 

In Figurcs[TH31 results are shown for one experiment with SNRd = 5dB, SNRo 
5dB. In all figures, the three panels on the left hand side contain the three com- 
ponents of the original trajectory {X} over time (in grey) and the three com- 
ponents of the assimilated trajectory {x} over time (in black). The right hand 
side, upper panel, shows the three components of the dynamical perturbations 
over time. The y-axis has been scaled with 1/^/S. This means that dynamical 
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perturbations of magnitude 1 in this plot are of similar amplitude as the model 
vector held. The right hand side, lower panel, shows the observations (black 
narrow line with diamonds), the output y t — Cxt (black wide line), as well as 
the noise-free observations CX t (grey wide line). For illustrative purposes, this 
panel zooms out a subinterval of the entire time span. Figures [I] [3] only differ 
in terms of the values of a used. Figure Q] shows the results for 7 = — 8, which 
corresponds to a = 3.35 • 10~ 4 . Figure [5] shows the results for 7 = 0, which 
corresponds to a = .5. This value for a gave the smallest assimilation error (see 
below). Figure |3] shows the results for 7 = 3.5, which corresponds to a = 0.97. 

A few interesting facts emerge. As expected, the trajectories become smoother 
with increasing values of a as they follow the unperturbed model dynamics to 
a higher and higher degree. The hrst component in particular starts off with 
following the noisy observations rather closely for small a, becoming progres- 
sively smoother with increasing a. For a = .97, the output looks essentially like 
that of an unperturbed Lorenz'63 system. The plot of the dynamical perturba- 
tions (upper panel in the right hand column) shows decreasing amplitudes for 
increasing a. 

As was discussed earlier, we should expect the deviation between the true 
{X} and the assimilated trajectory {x} to be minimal for some a. This effect 
can be readily discerned from Figures HH3] For small a (Fig. [IJ, the assimilated 
trajectory is very irregular due to the observational noise being assimilated into 
the system. For medium a (Fig. [5]), the assimilated trajectory is quite regular 
and closely follows the true trajectory. For large a (Fig. [3]), the deviations start 
to increase again, apparently due to model error kicking in. This is evident for 
example at t = 8.5, where true {X} and assimilated trajectory {x} slip out of 
phase. 

Figures I4UTT1 display the various errors Aa,Am, and At- Each figure cor- 
responds to a specific pair of observational and dynamical noise strength (see 
figure captions and Table [T]). In all figures, the assimilation error Aa fEq. [5^)1 
is plotted with a wide black line, the tracking error At and the modeling error 
Am (Eq. are plotted with a dashed line and a dash-dotted line, respectively. 
The grey line represents the assimilation error in the first component only (de- 
noted as A\ ) , namely the error between the model output and the observations 
without observational noise. This line was also shifted by a constant to allow 
for comparison with the (total) assimilation error A a- 

As expected, the modeling error decreases with increasing a, while the track- 
ing error increases. Across different experiments, these two quantities, albeit 
very similar, are not equal. For small a, the modeling error becomes very large 
but settles off at a finite value. The limiting values have been collected in Ta- 
ble [TJ The assimilation error shows a clear minimum in all cases. The position 
of the minimum depends on both the dynamical and the observational noise 
strength. The presented and further numerical experiments indicate that the 
minimum of the assimilation error moves towards larger values of a with increas- 
ing observational noise, and towards smaller values with increasing dynamical 
noise. The behaviour of the assimilation error of the first component A\ (dot- 
ted line) is qualitatively similar to A a- In particular, the minimum of A\ with 
respect to a is close to the minimum of A a ■ 

Of course, Aa and A\ would not be available under operational circum- 
stances. The question arises whether the quantitative behaviour of Aa can be 
predicted in advance, and in particular, whether the optimiser a can be iden- 
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Dyn. Noise Obs. Noise Am(o: = 0) Figure 



3 3 6 m 

4 4 5.5 E 

4 6 3.2 E 

5 5 4.5 

6 4 4.5 El 

7 5 3.5 |9] 

7 7 2.5 [TD] 

8 6 3.4 El 



Table 1: Investigated values for the dynamical and observational noise, limiting 
values for the modeling error Am for a 0, and corresponding figures. 



tified from properties of the model alone (plus some assumptions about the 
perturbations) . It has been suggested that for 4D-VAR, the weighting matrices 
R and S, respectively, should be chosen as the inverse of the observational and 
dynamical error covariances, respectively. It is hard to see how the error covari- 
ances could be determined in an independent fashion though (letting alone the 
question as to whether a nonlinear vector field with white noise perturbations is 
an appropriate model). Therefore, appropriate weighting matrices R and S (or 
a) need to be determined using some problem oriented criterion of solution qual- 
ity. Finding criteria is an important problem whi ch requires further research 



An op erationally feasible approach is presented in iBrocker and Szendro Teran 



(2010). The resulting weighting matrices R and S could be referred to as A 
posteriori effective error covariances. 



6.3 Numerical results for Lorenz'96 

Very similar experiments were carried out for the Lorenz'96 system. The Lorenz '96 
system has two time scales, the slow degrees of freedom X t £ R K and the fast 
degrees of freedom Z t <E R K L . The dynamics are given by 

X t = F(X t ,Z t ), Z t = G{X u Z t ) 

with F, G being defined as 

F^(x,z) = ^(x^-A^-^+c-SW, (57) 
G\x,z) = -a lZ ^ +1 \z^ - z('-D) - a 2 z« W /L , (58) 

S<*> = X> (fc+i) , (59) 

1=0 

where k and I are to be understood modulo K and K ■ L, respectively, with 
K = 64 and L = 8. Furthermore, c = 18 for all k, a± — 100, a,2 = 10. The output 
was set to r/ t = CX t + pr t , with Cij = ^ij, with i — 1. . .K/2,j = 1 . . . K, 
that is, only every second slow variable was observed. The sampling interval 
of the observations was 0.005. Observational noise with standard deviation 1 
was added, corresponding to a SNR of around 14dB. The entire time series 
comprised 2048 points, hence tf — t s — 10.24. 
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For assimilation, we use a model without the fast degrees of freedom, that 

is, 

/ W ( x ) = X (fc-D (a,(*+i) _ x (k-2) ) _ x (k) + C) (60) 

and the same c as above. As a BVP-solver, the NAG Fortran library routine 
D02R AF was used, which is largely based on the techniques described bv lPerevra 
(1979). The results are shown in Figure [T2] This figure is to be interpreted as 
Figures HHTT1 for the Lorenz'63 case. Due to the wider variations of the quan- 
tities in the present case, the y-axis was scaled logarithmically. Additionally, 
the modeling error in the hidden degrees of freedom was calculated, and is rep- 
resented by the grey dash-dotted line. For decreasing a, this quantity does not 
increase as much as the overall modeling error Am- This is to be expected, since 
for small a the observed degrees of freedom are forced to follow the observa- 
tions, which requires larger perturbations. In general, the behaviour of Am, At, 
and Aa is very similar to the Lorenz'63 experiments. A notable exception is 
that the minimum of the overall assimilation error and the assimilation error of 
the observed components have their minimum in slightly different places. This 
suggests that in more complicated examples, more than one a is necessary. In 
other words, in the action integral A a (Eq. [7]), the matrices R,S need to be 
regarded as tuning parameters and their influence on the solution investigated. 
This will be subject of a future study. 



7 Conclusions 

In this paper, a variational approach to data assimilation in nonlinear dynamical 
systems was discussed. The necessary conditions for a minimum of the action 
integral can be written as a Hamiltonian two point boundary value problem. It 
was argued that in order to improve the condition of this two point boundary 
value problem, some form of regularisation is necessary. One possible option 
is to allow for some deviations from the proposed dynamical model. In anal- 
ogy to related problems from control theory, these deviations could be termed 
"controls" , but the word "dynamical perturbations" was used here. The mag- 
nitude of the dynamical perturbations is included as an additional term in the 
action integral, and the problem becomes one of minimizing a combination of 
the deviation from the observation and the deviation from the proposed model. 

The connection to several related papers was discussed, and the main find- 
ings in these papers were interpreted with in the framework of the present 
contribution. In particular, it was discussed why several approaches in the lit- 
erature work well, despite the fact that they do not allow for dynamical errors 
(which appears to contradict one of the main tenets of the present paper) . The 
solution is that, roughly speaking, the discussed algorithms in fact do allow 
for dynamical errors, although they do not appear explicitely in the problem 
formulation. 

Furt hermore, an exp l anatio n w as given to the (probably p aradoxical) find- 
ing of Ridout and Judd ( 2002 ) and Brocker and Parlit3 ( 200ll ) that several al- 



gorithms work well despite the fact that they, after having been initialized prop- 
erly, essentially ignore the observational data. This has been explained here as 
a form of continuation. 

Observations from a numerical experiment were reported. As a prop for re- 
ality, the Lorenz'63 and '96 systems were considered, subject to both dynamical 
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Figure 1: Trajectories and dynamical perturbations for Lorenz'63 experiment 
with SNR D = 5dB,SNR = 5dB, and a = 3.35 • 1CT 4 . Panels on the left 
hand side contain the three components of the original trajectory (grey) and 
the assimilated trajectory (black). Right hand side, upper panel, contains the 
dynamical perturbations over time. Right hand side, lower panel, contains ob- 
servations (black narrow line with diamonds), output y t — Cx t (black wide 
line), and noise-free observations CX t (grey wide line). In this particular case, 
observations and output are almost identical. This implies though that the ob- 
servational noise is assimilated into the solution, leading to deviations between 
the assimilated and true trajectories (left hand side panels). Furthermore, the 
dynamical perturbations are large compare to the vector held (right hand side, 
upper panel). 
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Figure 2: Trajectories and dynamical perturbations for Lorenz'63 experiment 
with SNRd = 5dB, SNRo = 5dB, and a = 0.5. Interpretation of the plot sym- 
bols as in Figure[TJ For this value of a, the assimilation error Aa was minimum. 
The output closely follows the noise-free observations, and the assimilated tra- 
jectories agree well with the true trajectories (left hand side panels). Further- 
more, the dynamical perturbations arc small compared to the vector field (right 
hand side, upper panel). 
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Figure 3: Trajectories and dynamical perturbations for Lorenz'63 experiment 
with SNR D = 5dB, SNR = 5dB, and a = 0.97. Interpretation of the plot 
symbols as in Figure [TJ The output still follows the noise-free observations, but 
some error in the form of phase slip is visible at around t — 8.5. The assimilated 
trajectories still agree to some extent with the true trajectories (left hand side 
panels), but the phase slip at around t = 8.5 is visible here, too. The dynamical 
perturbations are very small compared to the vector field (right hand side, upper 
panel), as expected. 
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Figure 4: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
3dB, SNR = 3dB are shown. 




Figure 5: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
4dB, SNR = 4dB are shown. 
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Figure 6: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
4dB, SNR = 6dB are shown. 




Figure 7: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
5dB, SNR = 5dB are shown. 
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Figure 8: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
6dB, SNRq = 4dB are shown. 




Figure 9: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
7dB, SNRq = 5dB are shown. 
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Figure 10: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
7dB, SNR = 7dB are shown. 




Figure 11: Tracking error At (dashed line), modeling error Am (dash-dotted 
line), assimilation error A a (solid black line), and the assimilation error of 
the observed part of the state A\ as a function of a. Results for SNRd = 
8dB, SNR = 6dB are shown. 
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Figure 12: This figure is to be interpreted as Figures HHTT1 for the Lorenz'63 
case. Tracking error At (dashed line), modeling error Am (dash-dotted line), 
assimilation error A a (solid black line), and the assimilation error of the ob- 
served part of the state A\ as a function of a. Additionally, the modeling error 
in the hidden degrees of freedom was calculated, and is represented by the grey 
dash-dotted line. Due to the wider variations of the quantities in the present 
case, the y-axis was scaled logarithmically. 
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and observational error. Observations of several (but not all) state variables of 
these systems were employed for data assimilation. The effects of a finite sam- 
pling time and the necessity for smooting were discussed. The problem of how 
to weight the tracking error versus the dynamical error in the action integral 
and the resulting effects on the solution was mentioned. 
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